Each of the three intervention files has the same number of observations and the same columns; we row bind all three files to obtain one R object with interventions.
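As a minimal sketch of the row-binding step, the example below uses three small, hypothetical data frames in place of the real files (the object names and values are assumptions for illustration only). It checks that the column names agree before binding, which is the condition the sentence above relies on.

```r
# Hypothetical toy stand-ins for the three intervention files
# (the real files have 46 columns; only two are mimicked here).
file_1 <- data.frame(Mission_ID = c(1, 2), Vector_type = c("Ambulance", "MUG"))
file_2 <- data.frame(Mission_ID = c(3, 4), Vector_type = c("PIT", "Ambulance"))
file_3 <- data.frame(Mission_ID = c(5, 6), Vector_type = c("MUG", "Ambulance"))

files <- list(file_1, file_2, file_3)

# Row binding is only safe when every file has the same columns
same_columns <- all(vapply(
  files,
  function(f) identical(names(f), names(file_1)),
  logical(1)
))
stopifnot(same_columns)

# Stack the three data frames on top of each other
toy_interventions <- do.call(rbind, files)
nrow(toy_interventions)
```

With the tidyverse loaded, `dplyr::bind_rows()` would be an equivalent way to stack the data frames.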
glimpse(interventions)
## Rows: 601,881
## Columns: 46
## $ Mission_ID <int64> 10221520001, 10221520002, 1022152…
## $ Service_Name <chr> "HA UR MECH AZ St Maarten", "BA KAP…
## $ PostalCode_permanence <dbl> 2800, 2950, 2060, 2140, 2110, 2340,…
## $ CityName_permanence <chr> "Mechelen (Mechelen)", "Kapellen (K…
## $ StreetName_permanence <chr> "Liersesteenweg", "Essenhoutstraat"…
## $ HouseNumber_permanence <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ Latitude_permanence <dbl> 51.05102, 51.31208, 51.22249, 51.21…
## $ Longitude_permanence <dbl> 4.478030, 4.424398, 4.436290, 4.443…
## $ `Permanence_short name` <chr> "AAMECH01A", "AAKAPE01A", "UAANTW01…
## $ `Permanence_long name` <chr> "ZW MECHELEN 1", "ZW KAPELLEN 1", "…
## $ Vector_type <chr> "Ambulance", "Ambulance", "MUG", "A…
## $ EventType_Firstcall <chr> "P020 - Intoxication alcohol", "P06…
## $ EventLevel_Firstcall <chr> "N5", "N5", "N5", "N5", "N5", "N4",…
## $ EventType_Trip <chr> "P020 - Intoxication alcohol", "P06…
## $ EventLevel_Trip <chr> "N5", "N5", "N1", "N5", "N5", "N4",…
## $ PostalCode_intervention <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ CityName_intervention <chr> "Lier (Lier)", "Stabroek (Hoevenen)…
## $ Latitude_intervention <dbl> 51.12496, 51.30626, 51.30626, 51.23…
## $ Longitude_intervention <dbl> 4.57408, 4.40502, 4.40502, 4.44533,…
## $ Province_intervention <chr> "ANT", "ANT", "ANT", "ANT", "ANT", …
## $ T0 <chr> "01JUN22:00:01:34", "01JUN22:00:03:…
## $ T1 <chr> "01JUN22:00:03:26", "01JUN22:00:06:…
## $ T1confirmed <chr> "2022-06-01 00:20:14.417", "2022-06…
## $ T2 <chr> "2022-06-01 00:07:34.655", "2022-06…
## $ T3 <chr> "2022-06-01 00:17:53.888", "2022-06…
## $ T4 <chr> NA, "2022-06-01 00:46:34.089", "202…
## $ T5 <chr> NA, "2022-06-01 00:56:48.871", "202…
## $ T6 <chr> "2022-06-01 00:28:18.934", "2022-06…
## $ T7 <chr> "2022-06-01 00:53:17.876", "2022-06…
## $ T9 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Intervention_time (T1Reported)` <dbl> 14, 10, 13, 15, 10, 10, 9, 11, 5, 7…
## $ `Intervention_time (T1Confirmed)` <dbl> 0, 10, 13, 12, 9, 10, 8, 11, 5, 6, …
## $ Waiting_time <dbl> 16, 13, 32, 19, 15, 12, 14, 16, 7, …
## $ Intervention_duration <dbl> 27, 65, 63, 32, 73, 50, 59, NA, 73,…
## $ `Departure_time (T1Reported)` <dbl> 4, 4, 3, 3, 3, 3, 3, 4, 1, 4, 4, 4,…
## $ `Departure_time (T1Confirmed)` <dbl> 0, 4, 3, 0, 2, 3, 2, 4, 1, 3, 3, 3,…
## $ Unavailable_time <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Name_destination hospital` <chr> NA, "HA UR ANTW Jan Palfijn", NA, N…
## $ `PostalCode_destination hospital` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `CityName_destination hospital` <chr> NA, "Antwerpen (Merksem)", NA, NA, …
## $ `StreetName_destination hospital` <chr> NA, "Lange Bremstraat", NA, NA, "La…
## $ `HouseNumber_destination hospital` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Calculated_travelTime destinatio` <dbl> NA, 678, NA, NA, 530, 859, 838, 284…
## $ `Calculated_Distance destination` <dbl> NA, 11743, NA, NA, 5468, 13231, 958…
## $ `Number_of transported persons` <dbl> NA, 1, NA, NA, 1, 1, 1, 1, 2, 3, NA…
## $ Abandon_reason <chr> "Verzorgd ter plaatse", NA, NA, "Zo…
There are 10 possible vector types: Ambulance, Ambulance Disaster, Ambulance Event, Ambulance Exceptional, MUG, MUG Disaster, MUG Event, PIT, PIT Disaster, PIT Event. The vast majority are regular ambulances, MUGs, or PITs. An ambulance typically has two ambulance operators, who are usually not nurses or doctors. These operators cannot administer medication but can typically use an AED or help patients with oxygen for breathing. A PIT (Pre-hospital Intervention Team) has one ambulance operator and a nurse who has received special training. The nurse may administer medication, take an ECG, or draw blood. A MUG has an emergency doctor and a nurse who has received additional training. The nurse typically drives the car (often an SUV).
I hypothesize that the permanence indicates the location of the “vector” (PIT/MUG/Ambulance). Latitude and longitude are missing in 2.41% of the interventions, while the postal code is missing in only 0.43%. No coordinate system is specified for the longitude and latitude. The observed ranges of longitude (2.58895 to 612.739) and latitude (49.53729 to 513.118) do not match the coordinate bounds of WGS84, Lambert72, or Lambert2008 (probably the most commonly used coordinate systems for Belgium). We therefore use the tidygeocoder package to obtain the coordinates of each address.
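The checks described above can be sketched as follows. The data frame is a hypothetical toy example, and the bounding box values are rough assumed WGS84 limits for Belgium, not the exact bounds used in this analysis.

```r
# Hypothetical toy coordinates; the last two rows mimic misplaced decimals
perm <- data.frame(
  Latitude_permanence  = c(51.05102, 51.31208, NA, 513.118, 50.8503),
  Longitude_permanence = c(4.478030, 4.424398, NA, 44.3629, 612.739)
)

# Share of rows with a missing coordinate, in percent
is_missing  <- is.na(perm$Latitude_permanence) | is.na(perm$Longitude_permanence)
pct_missing <- 100 * mean(is_missing)

# Approximate WGS84 bounding box for Belgium in degrees (assumed values)
bbox <- list(lon = c(2.5, 6.5), lat = c(49.4, 51.6))

# TRUE only for rows with both coordinates present and inside the box
in_bbox <- !is_missing &
  perm$Longitude_permanence >= bbox$lon[1] &
  perm$Longitude_permanence <= bbox$lon[2] &
  perm$Latitude_permanence  >= bbox$lat[1] &
  perm$Latitude_permanence  <= bbox$lat[2]

# Observed ranges, ignoring missing values
range(perm$Longitude_permanence, na.rm = TRUE)
range(perm$Latitude_permanence, na.rm = TRUE)

# Number of non-missing rows that fall outside the bounding box
n_outside <- sum(!in_bbox & !is_missing)
```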
The figure below shows that this package works well (click on the markers to see the original address) although the coordinates are not found for all addresses (there are some NAs).
However, obtaining the geolocation is quite slow. It takes about 500 seconds to obtain 1000 geolocations in parallel (7 parallel cores). Since there are only 489 unique permanence addresses, it was still feasible to obtain these. It should be noted that 100 (20.45%) geolocations were not found using the tidygeocoder package.
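Because geocoding is slow, it helps to geocode each distinct address only once and join the result back to the full table. The sketch below illustrates that pattern on hypothetical toy data. The address format and the `fake_geocode()` placeholder are assumptions; the real step would call tidygeocoder, which needs network access and is therefore only shown in a comment.

```r
# Hypothetical toy table: many interventions share few permanence addresses
toy <- data.frame(
  Mission_ID = 1:5,
  StreetName_permanence = c("Liersesteenweg", "Essenhoutstraat",
                            "Liersesteenweg", "Liersesteenweg",
                            "Essenhoutstraat"),
  PostalCode_permanence = c(2800, 2950, 2800, 2800, 2950)
)

# Build one address string per row (this format is an assumption)
toy$address <- paste0(
  toy$StreetName_permanence, ", ", toy$PostalCode_permanence, ", Belgium"
)

# Keep each distinct address once, so it is geocoded only once
unique_addresses <- data.frame(address = unique(toy$address))

# Placeholder for the slow geocoding step. With tidygeocoder this could look like
#   tidygeocoder::geocode(unique_addresses, address = address, method = "osm")
# (not run here; the choice of method is an assumption).
fake_geocode <- function(tbl) {
  tbl$lat  <- NA_real_
  tbl$long <- NA_real_
  tbl
}
geocoded <- fake_geocode(unique_addresses)

# Join the coordinates back to every intervention row
toy_geo <- merge(toy, geocoded, by = "address", all.x = TRUE)
```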
The coordinates returned by tidygeocoder and those specified in the raw data seem to match quite well, except for a couple of obvious mistakes. The figure below shows the longitude and latitude according to tidygeocoder and according to the raw data. The red lines show the WGS84 bounds for Belgium. Most nonsensical raw longitude/latitude data points seem to lie on a straight line. We can hypothesize that these numbers should in fact be 10 or 100 times smaller.
Indeed, if we correct the raw data by dividing the longitude/latitude by either 10 or 100 (depending on the value), all permanence locations lie within the red boundaries and are approximately the same as the tidygeocoder locations (see figure below).
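One way to implement such a correction is sketched below. The helper `fix_scale()` and the bounds are hypothetical: it divides a value by 10 or 100 only when that brings it inside the assumed bounds, and leaves everything else untouched.

```r
# Approximate WGS84 bounds for Belgium in degrees (assumed values)
lon_bounds <- c(2.5, 6.5)
lat_bounds <- c(49.4, 51.6)

# Hypothetical helper: divide by 10 or 100 when that brings a value in bounds.
# Values that stay out of bounds after both attempts are left unchanged.
fix_scale <- function(x, bounds) {
  in_bounds <- function(v) !is.na(v) & v >= bounds[1] & v <= bounds[2]
  out <- x
  for (divisor in c(10, 100)) {
    needs_fix <- !is.na(out) & !in_bounds(out) & in_bounds(x / divisor)
    out[needs_fix] <- x[needs_fix] / divisor
  }
  out
}

# Hypothetical raw values: one valid, one 10x too large, one 100x too large, one NA
lon_raw <- c(4.478030, 44.3629, 612.739, NA)
lat_raw <- c(51.05102, 513.118, 5122.249, NA)

lon_fixed <- fix_scale(lon_raw, lon_bounds)
lat_fixed <- fix_scale(lat_raw, lat_bounds)
```

Checking the divided value against the bounds, rather than dividing by a fixed threshold, keeps already valid coordinates and missing values unchanged.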
As with the permanence locations, the longitude and latitude of the intervention locations contain some nonsensical values. We correct those values in the same way.
## Warning: Removed 815 rows containing missing values (`geom_point()`).
A random sample of 5000 intervention locations with the raw, uncorrected longitude and latitude. The red lines show the bounding box of Belgium for the WGS84 coordinate system.
## Warning: Removed 829 rows containing missing values (`geom_point()`).
A random sample of 5000 intervention locations with corrected longitude and latitude. The red lines show the bounding box of Belgium for the WGS84 coordinate system.